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Abstract 


Bayesian  estimators  for  the  parameters  of  the  finite 
state  Markov  renewal  process  are  developed,  both  for  the 
waiting  time  distributions  and  the  transition  probabilities. 
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1.  INTRODUCTION 


--^This  paper  develops  Bayesian  statistical  estimation 
procedures  for  the  finite  state  Markov  renewal  process.  The 
general  case  is  treated  where  uncertainty  exists  about  both 
the  waiting  time  distributions  and  the  transition  probabilities. 
This  work  extends  the  Bayesian  results  of  Martin  (1967)  ,  who 
only  considers  the  Markov  chain  case,  and  Brock  (1971),  who 
assumes  that  the  waiting  time  distributions  are  known.  Moore 
and  Pyke  (1968)  deal  only  with  classical  estimation  methods. 

The  sampling  schemes  considered  are  either  (1)  to  observe  n 
transitions  and  their  associated  waiting  times,  or,  more  generally, 
9-f^-to  observe  the  process  for  some  time  T,  where  T  is  not 
necessarily  a  transition  time.  K 

2.  THE  MARKOV  RENEWAL  PROCESS  MODEL 

The  Markov  renewal  process  is  a  convenient  and  workable 
generalization  of  both  a  Markov  process  and  a  renewal  process, 
incorporating  the  essential  features  of  each.  It  is  closely 
related  to  the  semi-Markov  process  independently  investigated  by 
Levy  [195^] >  Smith  [195^]  and  Takics  [195^]*  Essentially,  a  finite 
state  semi-Markov  process  is  a  stochastic  process  modelling 
moves  among  a  finite  number  of  states  with  the  successive  states 
visited  forming  a  Markov  chain  and  the  length  of  stay  in  a  given 
state  being  a  random  variable,  the  distribution  function  of  which 
may  depend  on  this  (origin)  state  as  well  as  on  the  one  to  be 
visited  next.  The  finite  state  semi-Markov  process  can  then  be 
thought  of  as  a  Markov  chain  for  which  the  time  scale  has  been 


2 


randomly  transformed. 

As  noted  by  Pyke  [1961a]  the  semi-Markov  process  is  equiv¬ 
alent  to  the  Markov  renewal  process  which  records  at  each  time 
t  the  number  of  times  an  entity  has  visited  each  of  the  possible 
states  up  to  time  t,  if  the  entity  moves  from  state  to  state 
according  to  a  Markov  chain,  and  if  the  time  required  for  each 
successive  move  is  a  random  variable  whose  distribution  function 
may  depend  on  the  two  states  between  which  the  move  is  being 
made.  An  early  application  of  the  semi-Markov  process  was  pur¬ 
sued  by  Cane  [1959].  Its  application  to  the  social  science 
phenomena  of  social  mobility  and  migration  is  outlined  by  Ginsberg 
[1971,  1972a, b]  and  it  is  reviewed  briefly  by  Bartholomew  [1973]. 
Statistical  inference  questions  about  the  transition  probabili¬ 
ties  are  considered  by  Moore  and  Pyke  [1968],  Building  on  the 
work  of  Martin  [1967],  Brock  [1971]  examines  a  Bayesian  pro¬ 
cedure  for  inference  about  the  transition  probabilities. 

We  shall  deal  with  a  finite  state  space,  labelled 
S  =  {l,2,...,s},  and  we  let  Q^(t)  denote  the  probability 
that  after  making  a  transition  into  state  i,  the  process  next 
makes  a  transition  into  state  j.  In  an  amount  of  time  less 
than  or  equal  to  t.  Note  explicitly  that  the  Markov  renewal 
model  allows  for  the  possibility  of  transitions  from  state  i 
to  state  i.  Such  transitions  may  or  may  not  have  substantive 
meaning  in  any  particular  application.  For  example,  in  migra¬ 
tion  studies  a  move  within  a  particular  geographic  region 
(state  i)  might  signal  a  transition.  On  the  other  hand.  In 
reliability  studies  the  states  usually  reflect  the  operating 
status  of  a  system,  and  a  "within  state"  transition  may  have  no 


0 .  But,  in 


physical  meaning.  In  this  later  case  Q  ^(t) 

general,  we  must  have  0  (t)  ^  0,  i,J  =  1,.. 

^  s 

=  11m  Q,  ,(t)  and  note  that  2 
t-*’06  J  =  1 

If  P  >  0,  let 


. ,s;  t  i  0. 


hr1- 1=1- 


Let 

•  y  S  • 


(t) 


Q±1 ( t ) 


U 


(1) 


(If  P  =0,  let 


Fij(t) 


be  arbitrary.) 


With  this  notation  represents  the  probability  that  the 

next  transition  will  be  into  state  j,  given  that  the  process 
has  just  entered  state  i,  and  F^(t)  represents  the  conditional 
probability  that  a  transition  will  take  place  within  an  amount 
of  time  t,  given  that  the  process  has  just  entered  i  and  will 
next  enter  j.  When  i  is  entered,  the  next  state  is  chosen 
according  to  the  transition  probabilities  P^ ;  then  given  that 
the  state  chosen  is  j,  the  time  until  transition  has  a  distri¬ 
bution  F^j  ( • ) .  These  quantities  P.,  ,  and  F^j(t)  are  estimable 
from  data  through  the  transition  frequencies  and  observed  wait¬ 
ing  times  between  the  various  transitions.  We  begin  the  estima¬ 
tion  process  in  the  next  section. 


3.  THE  LIKELIHOOD  FUNCTION 

If  the  Markov  renewal  process  is  observed  through  time  T, 
during  which  n  transitions  take  place,  the  data  will  have  the 
form 


O'o'1!’*!””  ,Tn,Xn,W)  ‘(x0,t1,x1. 


ww) 


where  XQ  is  the  state  initially  occupied,  is  the  state 
occupied  after  the  i^  transition,  and  is  the  waiting 

time  between  the  (i-l)st  and  ic^  transition.  The  likeli¬ 
hood  function  is  then  proportional  to 


where  q1j(t)  is  the  probability  density  function  corres¬ 
ponding  to  Q^jCt)  and 

n 

w  =  T  -  l  t,  . 

i=l  1 

Using  equation  (1),  L  can  be  rewritten  as 


f  x  <Ci> 
xi-l’xi  x 


xi-l' 


(w)P, 


where  f^Ct)  is  the  probability  density  function  corresponding 
to  Fjj  (t) •  This  is  the  same  as  the  likelihood  function  derived 
by  Moore  and  Pyke  (1968). 


In  most  practical  applications  there  will  not  be  enough 
data  to  adequately  estimate  the  waiting  time  distribution  func¬ 
tions  F  directly,  with  no  restrictions  placed  on  F.  Thus  it 
will  not  usually  be  feasible  to  estimate  F  through  the  empir¬ 
ical  distribution  function.  Instead  it  will  be  necessary  to 
restrict  the  class  of  distribution  functions  to  some  general 
parametric  family  and  then  estimate  the  indexing  parameters  of 


this  family.  It  will  also  be  true  that  in  most  practical 
applications  F  will  be  absolutely  continuous.  We  there¬ 
fore  assume  that  F  is  a  member  of  some  parametric  family 


of  continuous  distributions,  indexed  by  the  parameter  vector 

a 

§  .  We  will  then  write  as  FT",  the  distribution  function  of 

the  waiting  times  between  transitions  from  state  i  to  state 
9 

j ,  with  f 7\  denoting  the  corresponding  density  function. 

Then  letting  nru  denote  the  number  of  observed  transitions 
from  state  r  to  state  u,  L  can  be  further  rewritten  as 


L  ■  l5  lp  -  jli 


Le  Fx 


(w) 


n  j 


L  P 
B  x 


n  j 


(2) 


where 


s  nru 
LS  '  r,u*l  kll  f™(Cru(k)>  • 


-0 


with  tru(^c)  being  the  observed  waiting  time  between  the 
(k-l)s  and  k  transition  from  state  r  to  state  u,  and 


L_  = 


s 

n 

r  ,u=l 


n 

s  ru 
ru 


Thus  the  sufficient  statistic  for  this  model  is  a  vector 

Z=  (W,  Z^,Z^),  where  Z^  denotes  the  vector  of  observed 

(2) 

waiting  times,  t^Ck) ,  between  pairs  of  states,  and 
denotes  the  vector  of  transition  counts  nru> 


o 


4.  THE  FORM  OF  BAYES  ESTIMATORS 


We  consider  a  Bayesian  treatment  of  the  estimation  problem. 
This  is  desirable  because  of  the  flexibility  of  the  Bayesian 
approach  in  incorporating  varying  amounts  of  prior  information  and 
its  success  in  handling  the  problem  of  limited  amounts  of  relevant 
sampling  data. 

We  now  derive  Bayes  estimators  of  P  and  9  using  squared 
error  loss.  Thus  we  require  the  mean  of  the  posterior  distribu¬ 
tion  of  P  and  9.  Let  the  prior  joint  density  of  P  and  6 
be  denoted  by  t.  Then  the  Bayes  estimator  of  P  is 


E(P|z) 


j PL^dFde 
/ L  *  dFda 


The  form  of  the  Bayes  estimator  will  simplify  substantially 
if  we  assume  prior  independence  of  P  and  9,  i.e.,  that  -  can 
be  written  in  the  form 


JT=  n(  P  )  •  rr  (9). 


By  using  equation  (2)  we  can  rewrite  the  normalizing  constant, 
/L  * dPd9  ,  as  follows: 

First,  to  simplify  notation,  let 


F8 

x 


(w)  = 


n** 


Fj 


and 


xn,J 


Then 


/  L  tt  dPd9  = 

/LaLr,ff  ( P)  it (  9  )dPd6 
1  9  r  -  -  -  - 


s  @ 

l  /  L.F^LpP  .ir(P)ir(9)dPd0 

_  -I 


7 


[jVC.)d.  /  L?  T(p)dpJ-  f^|7  Lgpf  T(e)de  /  Lp  Pj  T(P)dpJ  . 


(^ ) 


Similarly , 
/PL  ~dF  d0  = 


(7  T(e)de  j/  PJyr(p)QP  -l  j  LgP®T(e)dej  [/ PLg  PjT(P)dP  . 


(5) 


Tc  obtain  the  Bayes  estimator  of  9  we  need  to  compute  the 
posterior  expectation, 


E6  = 


J  SLird:d9 
/  L  tt  d  ?  d  6 


(5) 


The  numerator  of  equation  (6)  can  be  rewritten  as 


/  SLrrdPd  8  - 


JaL  i( 9 )d9 

—  H  *>-  ** 


/Lpu(P)dP 


s 

I 

J-l 


/ 9L  F  j  tt  (  9  )  d9 

~  V  J 


/LpP  <  TT  ( P )  dpj 
_  0  J 


f  ^  \ 
\  I  ) 


Note  that  under  the  assumption  of  independent  prior  distributions 
for  ?  and  0  the  calculation  of  Bayes  estimates  involves  separate 
evaluations  of  integrals  with  respect  to  9  and  integrals  with 
respect  to  P.  Also  note  the  substantial  simplication  that  occurs 
when  the  terminal  time  T  happens  to  be  a  transition  time,  i.e., 
when  w-0 .  In  this  case  the  Bayes  estimators  are  just 


?*  =  E(?|Z(2' ) 


/PLpir(P)dP 
J  Lp  n ( P )dP  5 

L 


and 


9*  =  E(9 jZ(1)) 


/ §L  TT  (  g  )dg 

/L§^(§)d6 


In  the  general  case  where  w>_0  the  Bayes  estimators 
take  a  form  adjusted  away  from  P*  and  9*.  Specifically, 
using  equations  (4)  and  (5),  we  can  write  the  Bayes  estimator 
P  as 


wnere 


and 


P*  -  l  o.S(PP.  |Z(2)) 
1=1  J  '  J  - 

_ J _ i _ I _ I _ 

i  -  I  a  .  8  . 

J-l  "  J 


a  .  =  E(?f  |  Z(  1/1  1 
j  J  1  - 


6j  =  E(Pj:|Z(2)) 


The  Bayes  estimator  9  can  be  written  using  equations  (4)  and 
( 7 )  as 


9*  -  l  8 .E(eF> | Z(1)  ) 
,1=1  ,J  _ 

1  -  ^  a-6j 

i=l  J  « 


The  practical  use  of  the  Bayes  estimators  P  and  9 
is  limited  by  the  fact  that  integration  of  functions  of  9 
involving  F-  is  required,  and  F-  is  generally  a  quite 
complicated  function  of  9.  An  alternative  pair  of  estimators 


P  and  9  can  be  derived  by  replacing 


V?  4 


n  equations  (c;  and 


0  *  0  * 

k9j  with  the  constant  (over  §),  F-  ’..Vicing  then  Ft 

$ 

Just  a.,  we  have  the  simplified  estimators. 


9  =  E( e  I  x  '  ) 


{  =  e* ) 


?*  -  l  a  ?  E  (  ?  P ,  | Z ' ~ ) 

4  —  **  u  J 


1  -  l  a  ?  S  , 
j=l  ^ 

5.  MODELS  ?CR  WAITING  TIME  DISTRIBUTIONS 

We  seek  parametric  families  cf  continuous  positive  random 
variables  which  are  sufficiently  rich  to  represent  a  wide  range 
of  possible  waiting  time  distributions,  while  at  the  same  time 
being  analytically  tractable  in  terms  of  the  Bayesian  estimaticr 


:  c  scares 


ve  loped  in  Section  «.  Two  families  of  dist  ritur  i  cr.s  , 


each  specified  by  2  parameters,  immediately  suggest  tr.emselves 
for  this  purpose:  the  lognormal  and  the  gamma.  They  are  cis- 
cussed  in  Johnson  and  Koto  [1970]  and  Mann,  Schafer,  and  Sir.gp: 
walla  [1979].  A  random  variable  T  is  said  to  have  a  .two- 
parameter)  lognormal  distribution  if  the  natural  logarithm  of 
m  has  a  normal  distribution  (see  Aitchison  and  Brown  [1957]). 

Thus  the  distribution  of  T  is  determined  by  the  mean  u  and 

2  2 
variance  of  a  of  the  normal  distribution.  When  a  is  smai- 


;he  distribution  of 


;seif  will  be  not  unlike  that 


normal  distribution.  The  gamma  family  specifies  that  the  croc- 
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ability  density  function  of  T  would  be  given  by 

=  3ata“~e~3t 

*t""  rT^l 

for  t>C  where  a  and  6  are  positive  parameters.  Here  again 
it  is  possible  to  transform  the  random,  variable  T  to  a  very 
good  approximation  of  normality  for  only  moderately  large  a. 
The  Wilson  Hilferty  [1931]  approximation  can  be  used  to  show 
that  T“7 J  has  approximately  a  normal  distribution  with  mean 
U/S  w  "(l-l/9a)  and  variance  (  a/6  )  ~/  ^/9a  . 

It  also  should  be  noted  that  the  distribution  of  leg  T 
is  more  nearly  normal  than  the  distribution  of  7,  sc  that  a 


:1c 

~  rar.s 

formation  to  norma 

-  ,•  r 

---j  -a 

ideal  in  the  lognormal 

ion 

case 

and  helps  in  the 

^amrr.a  di 

stributicr.  case. 

ube 

root 

transformation  is 

best  in 

the  gamma  distribution 

case . 

From,  a  strictly  empirical  viewpoint  it  may  be  desirable  to 
use  the  data  to  choose  a  transformation  to  normality  from  among 
the  power  transformations  considered  by  Box  and  Cox  [19b-]. 

These  include  the  cube  root  transformation;  the  logarithmic 
transformation  is  a  limiting  case.  Thus  we  see  that  it  will 
generally  be  possible  to  transform  the  waiting  time  data  to 
achieve  approximate  normality. 

Therefore  we  assume  that  it  is  possible  to  deal  with  trans¬ 
formed  values  u  of  the  original  waiting  times  T,  these  trans¬ 
formed  values  U  having  approximate  normal  dist rifcut ior.s .  The 
orobiem  of  statistical  inference  about  the  distribution  of  wait¬ 
ing-  times  then  becomes  one  of  inference  about  the  means  and 
variances  of  the  normal  random  variables  ". 
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o.  BAYES  ESTIMATES  OF  THE  WAITING  TIME  DISTRIBUTIONS 


Suppose  that  it  is  possible  to  transform  the  vector  of 

all  waiting  tines  T  (s~xl)  to  a  multivariate  normal  random 
2 

vector  U  (s  *1) .  We  seek  to  estimate  the  mean  vector  u 

o  2  ? 

( s" *1 '  and  ohe  variance  matrix  t  (s  xs“)  of  U. 

Assume  that  a  prior  distribution  for  y  and  the  precision 
matrix  P.  =  |-i  is  chosen  from  the  natural  conjugate  family 
of  normal-Wishart  distributions.  We  then  assume  that  the  prior 
distribution  has  the  following  structure:  The  conditional  dis¬ 
tribution  of  y  given  that  R  =  r  is  a  multivariate  normal 

distribution  with  mean  vector  y_  (s“xl)  and  orecisicn  matrix 

~  J 

vQr(v0>  0)  ,  and  the  marginal  distribution  of  R  is  a  Wisnart 

2 

distribution  with  aQ(ao>s  -1)  degrees  of  freedom  and  positive 

2  2 

definite  precision  matrix  t_  (s‘*s  ).  Let  be  a 

random  sample  distributed  like  U. 

Under  these  circumstances  the  essential  basis  for  inference 
is  contained  in  the  following  theorem. 

Theorem  (See,  e.g.,  DeGroct  (1970  ;p  .  17S )  .  ) 

The  posterior  joint  distribution  of  y  and  R  when 
U.  =  u,  (i*l,...,n)  is  as  follows:  The  conditional  distribution 
of  y  when  R  =  r  is  a  multivariate  normal  distribution  with  mean 
vector  y^  and  precision  matrix  (v^)r,  where  =  vQ  +  n  and 


The  marginal  distribution  of  R  is  a  Wishart  distribution 
with  a]_  =  aQ  +  n  degrees  of  freedom  ana  precision  matrix  t^,  where 


12 


-1  "  To+  ?  (!fo_y)(!;o“?)  '* 


and 


S  =  l  (U,-U) (U.-U) 


If  we  now  define  a  multivariate  quadratic  loss  function 
of  the  form 

L(  u  ,u  )  =  ( u-jT)  '  ( u-iT)  , 

we  can  obtain  the  Bayes  estimator  of  u  as  the  mean  of  the 
posterior  distribution  of  u,  namely, 


u 


u 


i  * 


Also  for  a  loss  function  of  the  form 

L($,f)  =  tr(f-f)2, 
the  Bayes  estimator  for  \  is 


I  a+n-2 (sw-l ) 
(See,  e.g..  Press  (  1972  ;pp.  l63f f )  .  ) . 


7.  BAYES  ESTIMATES  OF  THE  TRANSITION  PROBABILITIES 


We  consider  a  Bayesian  approach  to  estimation  of  the 
transition  probabilities  P  in  the  Markov  renewal  process. 
Martin  [13]  developed  a  methodology  for  Bayesian  inference 
about  the  transition  probabilities  of  a  Markov  chain.  He  noted 
that  in  this  case  the  observed  transition  counts  in  each  row 
of  P  come  from  a  multinomial  distribution  with  ceil  probabil¬ 
ities  depending  in  a  particular  way  on  P.  A  matrix  beta 


r 
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distribution  is  then  postulated  as  appropriate  for  a  prior 
distribution.  This  is  just  the  distribution  of  s  independent 
Dirichlet  random  vectors  (see  Johnson  and  Kotz  ( 197 2  ; p  .  233 )  • 

The  matrix  beta  distribution  is,  like  the  Dirichlet,  closed 
under  sampling  so  that  the  posterior  distribution  is  also  a 
member  of  the  matrix  beta  family. 

Specifically,  the  matrix  beta  distribution  is  determined 
by  a  matrix  parameter  M.  If  Mq  is  taken  to  be  the  matrix 
parameter  specifying  the  prior  distribution  and  a  matrix  of 
frequency  counts  N  is  observed,  then  the  posterior  distri¬ 
bution  of  P  is  matrix  beta  with  matrix  parameter  M*Mq+N. 

Under  quadratic  loss  the  Bayes  estimator  of  P  is  then  given 
by  the  posterior  expectation  of  P.  But  the  expectation  of 
a  random  matrix  P  having  a  matrix  beta  distribution  with 
matrix  parameter  M=Cm< ^ ]  is  simply, 

m.  . 

EP  =  [-±1]  (11) 

i+ 

s 

where  m.  =  Z  m. .  (See  Martin  [1967;  p.  17].) 

1  j=l 

This  same  approach  is  followed  by  Brock  [1971]  for  the  Markov 
renewal  process  when  the  waiting  time  distribution  are  assumed 
known . 

For  Bayesian  estimation  In  the  general  Markov  renewal 
process,  we  require,  according  to  equation  (10),  two  specific 
conditional  expectations,  E(P|Z^^)  and  E(PP 

-  ,  j  — 


The  first  conditional  expectation  is  available  directly  from 
equation  (11).  The  second  conditional  expectation  can  be 
computed  as  follows:  We  basically  require,  for  P  having  a 
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matrix  beta  distribution  with  matrix  parameter  M  =  [m  ] , 


the  expectation 


E(PP1J>  '  CEpruPij’- 


EP  P.,  =  Cov ( P  ,P,.)  +  EP  EP,, 
ru  ij  ru5  ij  '  ru  ij 


Now  from  Martin  [1967;p.l3]  we  have 


Cov(Pru'PiJ)  " 


m  m .  . 
ru  i.i 


(mr+)  (m1++l) 


i=r,j+u. 


;nru(:nl+-m1.1 1 
(mr+)2(m1++l) 


■,  i=r,j=u, 


,  otherwise. 


Therefore,  making  use  of  equation  (11)  and  simplifying, 
we  have 


( p  p  ) 

“uru  ij  ' 


m  m,  . 


m  (m, ,+l) 

-  -  .  i-r.J-u, 

">r+(ini*+1) 


m  m,  , 
ru  i: 


mr+mi+ 


,  i+r. 


Substituting  the  calculated  values  in  equations  (11)  and  (12) 
into  the  estimator  given  in  equation  (10)  we  obtain  after 
simplification. 
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where 


P 

ru 


m 


m 


ru 

r+ 


m 


=  a 


u  m 


ru 

r+ 


r/x 


n 


r=x 


n  * 


m 


1  - 


a  = 
u 


,b  j  m  ,+l 
j  /  u  J  xp+ 


- 


m  +1 
xnu 

u  m  +1 
n 


m 


1  - 


l  a*  — n- 

’  V 


j 


(13) 


It  is  easy  to  verify  directly  that  with  the  matrix  beta 
prior  distribution  either  the  Bayes  estimates  P  or  their 
approximations  P  have  row  sums  equal  to  one  and  have  non¬ 
negative  entries.  Hence  either  ?  or  T  can  be  used  as  a 
legitimate  transition  probability  matrix. 


8.  A  SIMULATION  EXAMPLE 

This  section  illustrates  the  application  of  the  statistical 
techniques  developed  in  the  previous  section  to  simulated  data. 
The  use  of  simulated  data  is  advantageous  at  this  point  because 
a  direct  comparison  can  be  made  between  results  based  on  esti¬ 
mated  process  parameters  and  true  process  parameters. 
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Data  from  a  3-state  Markov  renewal  process  with  log- 
normally  distributed  interarrival  times  were  simulated  on  an 
IBM  360/67  computer  at  Carnegie-Mellon  University.  The  tran¬ 
sition  probability  matrix  was 

.0  .1 

.0  *  3  • 

.3  .6 

To  give  concreteness  to  this  example,  imagine  that  state 
i=l,2,3  indicates  the  number  of  projects  that  an  analyse 
has  been  assigned  to  work  on  simultaneously.  Often  when  a 
project  is  completed,  it  will  be  replaced  immediately  with 
another  one.  Hence  there  is  high  probability  of  an  interstate 
move.  The  analyst  is  never  assigned  more  than  three  projects 
and  always  is  assigned  at  least  one.  The  length  of  time  a 
project  will  take  to  complete  is  a  random  variable.  t  is  pos¬ 
sible  to  complete  two  projects  simultaneously  and  two  additional 
projects  might  be  assigned  to  the  analyst  currently  working  on 
one.  The  9  waiting  time  distributions  between  transitions 
are  taken  to  be  lognormally  distributed.  The  means  u  and  pre¬ 
cisions  t  (reciprocal  of  the  variances)  of  these  distributions 
are  given,  correspondingly  to  the  entries  of  P ,  as  the  first 
line  of  each  of  9  cells  In  Table  1,  below. 
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TABLE  1 

True  Waiting  Time  Distribution  Parameters 


(1,  1) 

(1,  1/4) 

(2,  1/9) 

4.5 

20.1 

665.1 

2.7 

2.7 

7.4 

(1,  1/9) 

(2,1) 

(2,  1/4) 

244.7 

12.2 

54.6 

2.7 

7.4 

7.4 

(2,  1/9) 

(2,  1/9) 

(1,1) 

665-1 

665.1 

4 . 7 

7.4 

7.4 

2.7 

Thus,  for  example,  the  logarithm  of  the  time  between  a  trans¬ 
ition  from  state  2  to  state  3  was  normally  distributed  with  a 
mean  of  2  and  a  standard  deviation  of  2.  The  expected 
interarrival  times  T  can  be  calculated  from 

ET  =  exp  ( y  +  ■—) 

and  are  given  as  the  second  line  of  each  cell  in  Table  1. 

Since  the  lognormal  distribution  is  skew  the  median  interarrival 
times  exp  (u)  differ  from  the  mean  interarrival  times  and  are 
given  as  the  third  line  of  each  cell  in  Table  1.  In  our  illus¬ 
tration,  the  units  of  T  might  be  days. 

The  Bayes  procedures  outlined  in  the  previous  sections 
require  the  specification  of  the  parameters  of  the  prior  dis¬ 
tributions.  There  are  9  waiting  time  distributions,  each  of 
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which  requires  that  4  parameters,  u  ,  v  ,  a  ,  and  g  =■=•  t 

o  o  o  02® 

of  the  normal-gamma  conjugate  prior  distribution  be  set. 

We  will  illustrate  the  technique  with  independent  and  identical 

prior  distributions  on  all  waiting  time  distributions.  There 

is,  of  course,  no  reason  why  this  should  necessarily  be  the 

case  in  a  practical  application;  we  do  this  only  for  ease  of 

illustration.  In  order  to  give  a  reasonably  diffuse  prior 

specification,  while  not  being  too  far  off  from  the  target, 

we  choose  our  hyperparameters  to  be  u  =  1 ,  X  =l,a  =1, 

o  o  ’  o  ’ 

and  8Q  =  4.  This  leads  to  prior  (minimum  mean  squared  error) 
estimates  (no-data)  of  the  means  and  precisions  of  u  =  =  1 

and  t  1  =  (cXq/Bq)  ■*"  =  (1/4)  ^ .  Prior  estimated  expected  inter¬ 
arrival  times  can  then  be  calculated  as 


ET  =  exp  ( {i+  j  4)  =  20.086 

~  T 


and  the  estimated  median  waiting  time  would  be 


ey  =  2.718. 


Mote  that  these  estimated  means  and  medians  are  not  Bayes  with 
respect  to  squared  error  loss.  They  are  only  given  to  illustrate 
their  approximate  magnitude,  since  our  real  concern  is  with 
u  and  t  ^ .  Naturally  since  the  actual  expected  waiting  times 
vary  substantially  among  themselves  this  estimate  misses  some 
of  the  expected  waiting  times  by  a  substantial  margin. 

To  complete  the  specification  of  the  prior  parameters  we 
need  to  give  the  3x3  matrix  parameter  MQ  of  the  matrix  beta 
distribution  for  the  transition  probability  matrix  P.  We  cheese 
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5  3  2 

2  6  2 

2  3  5 

which  leads  to  prior  estimates  of 

'.5  .3 

P  ~  .2  .6 

..2  .3 

Note  that  it  is  not  necessary  for  the  row  sums  of  M  to  be 

10,  they  could  be  any  positive  number. 

We  now  wish  to  show  how  these  prior  estimates  will  be 
modified  with  data.  Simulated  data  of  91  transitions  (be¬ 
ginning  in  state  1),  together  with  their  associated  lognormaliy 
distributed  waiting  times  are  used.  The  last  state  occupied 

was  x  =2  and  the  observed  time  in  this  state  before  termina- 
n 

tion  was  w  *  21.3.  The  stopping  rule  used  was  to  stop  when 
the  total  observation  time  was  631.3.  The  "observed"  transi¬ 
tion  counts,  based  on  the  simulation,  are  given  by 

'27  3 

N  =  7  20  13 

2  12  25 

Thus  the  maximum  likelihood  estimate  of  the  transition  probabil¬ 
ity  matrix  is 

.167  .533  .250 

.175  -500  .325 

.051  .308  .641 
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Table  2  gives  the  maximum  likelihood  estimates  of  the 
parameters  of  the  lognormal  distributions  of  waiting  times. 

The  format  of  the  table  corresponds  exactly  to  that  of  Table  1. 
Thus  the  maximum  likelihood  estimate  of  the  mean  waiting  time 
for  a  transition  from  state  2  to  state  3  is  533-215,  while  a 
comparison  with  Table  1  shows  the  true  mean  waiting  time  to 
be  54.6. 


TABLE  2 

Maximum  Likelihood  Estimates  of  the  Waiting 
Time  Distribution  Parameters 


(1.401,  4.120) 

(-0.139,  0.605) 

— 

(3.346,  0.090) 

4.583 

1.989 

12,107.2 

4.059 

.370 

46.S05 

(1.774,  0.533) 

(2.013,  1.22C) 

(2.256,  0.124) 

1  -U  .  930 

1 1  •  -  1  0 

53S . 215 

5.594 

7.436 

9.345 

(-0.045,  6.331) 

(1.696,  0.203) 

(1.073,  0.673) 

1.035 

60.331 

6.173 

.  Q56 

5.452 

2.939 

Employment  of  the  Bayesian  methods  discussed  in  Section  6 
produced  the  estimates  9  of  the  parameters  of  the  waiting 
time  distributions  contained  in  Table  3-  Table  3  has  the  same 


format  at  Tables  1  and  2. 


We  have  simply  displayed  the 
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estimated  expected  waiting  times  as  exp(u  +  4  t~j‘)  and  the 
estimated  median  waiting  times  as  exp  (  y  ) ,  since  our  interest 
centers  on  minimum  mean  squared  error  estimation  of  y  and  t. 


TABLE  3 

Bayes  Estimates  of  the  Waiting  Time 
Distribution  Parameters 


|  (1.267,  .  966 ) 

(.003,  .935) 

(3-135,  .105) 

10.396 

3.710 

2,631.507 

1  0-0^ 

1  .  cc 3 

,5  - | 

.  .. 

!  /  i  i'77  ;,i 

{  +  •  0  i  1  ,  .-*10; 

(1.965,  .567) 

,,  -cc  i 

^  j 

!~.702 

I 

r  •  -> 

i 

5  •  351 

•  *  ~  t  J 

=.-26  j 

(.303,  .  9 9  2  } 

(1.692,  .212) 

! 

( -  67;  C  C  ^ ' 

\  •  vj  1  ^  y  •  ✓  O  y 

4.I95 

39.0-9 

6.761 

1  •  35J 

__ 

5 . 163 

2.930 

_ _ _ 1 

Comparison  cf  Tables  1,  2,  and  3  plus  the  prior  estimates 
shows  that  (1)  the  posterior  3ayes  estimates  effectively 
restrain  the  rather  extreme  fluctuation  that  afflict  these 
small  sample  maximum  likelihood  estimates,  and  (2)  the  posterior 
Bayes  estimates  are  overall  slightly  better  than  both  the  prior 
Bayes  estimates  and  the  maximum  likelihood  estimates. 

Ignoring  the  observed  terminal  waiting  time  in  state  2  of 
w  =  21.3,  the  estimated  transition  probability  matrix  is 


o 


Incorporating  the  observed  terminal  waiting  time  according 

to  the  procedure  of  Section  7  requires  that  we  calculate 
0 

?*,C2I.3)  for  J-1,2,3  to  incorporate  in  the  estimator  given 

*—  c 

by  equation  (13)-  These  calculated  values  are 

=  .  333,  F,_,  =  .  Sup,  ana  =  .6  59  , 

and  hence  the  estimate  ?  is 


This  compares  favorably  with  the  maximum  likelihood  estimates 
and  the  prior  estimates. 
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